A SAT-based Public Key Cryptography Scheme

Sebastian E. Schmittner 


Abstract —A homomorphic public key crypto-scheme based on the Boolean Satisfiability Problem is proposed. The public key is a SAT 
formula satisfied by the private key. Probabilistic encryption generates functions implied to be false by the public key XOR the message 
bits. A zero-knowledge proof is used to provide signatures. 

Index Terms —Public key cryptosystems, Cryptographic protocols, Data Encryption, Message authentication. Digital signatures. 
1 Introduction

Unlike the symmetric ones, the asymmetric cryptography schemes predominantly used today are vulnerable (at least) to attacks from quantum computers using Shor's algorithm. Assuming $P \neq NP$ and that NP-hard problems cannot be solved efficiently, not even on a quantum computer, cryptography based on NP-hard problems is dubbed "post-quantum". Daniel Bernstein [1] lists hash-based, code-based, lattice-based, and multivariate-quadratic-equations cryptography as the existing post-quantum algorithms. The aim of this paper is to introduce a different crypto-system based on the Boolean Satisfiability Problem (SAT). This problem of finding a pre-image of 1 under a Boolean function given in a certain conjunctive form (see Section 2) is well known to be NP-complete. Although a SAT instance can, from a certain angle, be viewed as a multivariate equation, our crypto-scheme is different from the systems mentioned above. In particular, we use a satisfying assignment as the secret key rather than a trap door.

Using SAT to provide a post-quantum key pair is not a far-fetched idea. The main point of this paper is how to use such a key pair to encrypt and decrypt. To this end, we randomly produce Boolean functions which are implied to be false by the public key and hence evaluate to 0 on the private key. These functions, XORed ($\oplus$) with the message bits, form the ciphertext.

The paper is organised as follows: We illustrate the main point of using randomness for encryption in a (somewhat oversimplified) picture in Figure 1. The guiding example in Section 1.1 illustrates the main ideas more accurately. In Section 2 we describe and discuss our algorithms in detail. The probabilistic scheme introduced there is vulnerable to an oracle attack, which is discussed in Section 3 together with some other attacks. In that section we also compute bounds for the parameters of the encryption to resist all mentioned attacks and explain how the oracle attack can be countered. In Section 5 we introduce an identification and signature scheme, which is independent of the encryption. We conclude in Section 6.
1.1 Example

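Before explaining the key generation and encryption algorithm in detail, we start with a simple toy example to guide the reader's intuition. A possible private/public key pair (see footnote 1) for Alice is:

$$ \mathit{priv} = (1,1,0,0,1,0,0) \tag{1} $$

$$ \mathit{pub} = (\bar{x}_1 \vee \bar{x}_2 \vee \bar{x}_3) \wedge (\bar{x}_1 \vee x_4 \vee x_5) \tag{2} $$

$$ \phantom{\mathit{pub} = {}} \wedge (x_1 \vee x_6 \vee x_7) \tag{3} $$

since $\mathit{pub}(\mathit{priv}) = 1$. Bob wants to send a bit $y \in \mathbb{B}$ to Alice. He rewrites the negated clauses in algebraic normal form,

$$ \bar{c}_1 = x_1 x_2 x_3 \tag{4} $$

$$ \bar{c}_2 = x_1 \oplus x_1 x_4 \oplus x_1 x_5 \oplus x_1 x_4 x_5 \tag{5} $$

$$ \bar{c}_3 = 1 \oplus x_1 \oplus x_6 \oplus x_7 \oplus x_1 x_6 \oplus x_1 x_7 \oplus x_6 x_7 \tag{6} $$

$$ \phantom{\bar{c}_3 = {}} \oplus x_1 x_6 x_7 \tag{7} $$

and randomly generates

$$ R_{2,3} = x_4 \oplus x_5 \oplus x_6 \oplus x_4 x_5 \oplus x_4 x_6 x_7 \tag{8} $$

$$ \phantom{R_{2,3} = {}} \oplus x_5 x_6 x_7 \oplus x_4 x_5 x_6 x_7 \tag{9} $$

$$ R_{1,3} = 1 \oplus x_2 x_3 \oplus x_6 x_7 \oplus x_2 x_3 x_6 x_7 \tag{10} $$

$$ R_{1,2} = 1 \oplus x_4 \oplus x_5 \oplus x_2 x_3 \oplus x_4 x_5 . \tag{11} $$

The ciphertext is

$$ g = y \oplus \bar{c}_1 R_{2,3} \oplus \bar{c}_2 R_{1,3} \oplus \bar{c}_3 R_{1,2} \tag{12a} $$

$$ = y \oplus 1 \oplus x_4 \oplus x_5 \oplus x_6 \oplus x_7 \oplus x_1 x_6 \oplus x_1 x_7 \tag{12b} $$

$$ \oplus x_2 x_3 \oplus x_4 x_5 \oplus x_4 x_6 \oplus x_4 x_7 \oplus x_5 x_6 \oplus x_5 x_7 \tag{12c} $$

$$ \oplus x_6 x_7 \oplus x_1 x_4 x_6 \oplus x_1 x_4 x_7 \oplus x_1 x_5 x_6 \oplus x_1 x_5 x_7 \tag{12d} $$

$$ \oplus x_2 x_3 x_6 \oplus x_2 x_3 x_7 \oplus x_4 x_5 x_6 \oplus x_4 x_5 x_7 \tag{12e} $$

$$ \oplus x_4 x_6 x_7 \oplus x_5 x_6 x_7 \oplus x_1 x_2 x_3 x_7 \oplus x_1 x_4 x_5 x_6 \tag{12f} $$

$$ \oplus x_1 x_4 x_5 x_7 \oplus x_2 x_3 x_6 x_7 \oplus x_4 x_5 x_6 x_7 . \tag{12g} $$

Bob sends the ciphertext $g$ for the clear text $y$ to Alice, who decodes $g(\mathit{priv}) = y$.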
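The toy example can be checked mechanically. The following is a minimal sketch, not the author's implementation: it assumes an ANF is stored as a Python set of monomials, each monomial being a frozenset of variable indices (the empty frozenset is the constant 1). The helper names `anf_mul`, `anf_eval`, and `M` are chosen here for illustration only.

```python
from itertools import product

def anf_mul(f, g):
    """Multiply two ANFs over GF(2); x*x = x and equal monomials cancel."""
    out = set()
    for a, b in product(f, g):
        out ^= {a | b}
    return out

def anf_eval(f, x):
    """Evaluate an ANF on an assignment x (dict: variable index -> 0/1)."""
    return sum(all(x[i] for i in mono) for mono in f) % 2

def M(*idx):
    """Monomial x_{idx[0]} x_{idx[1]} ...; M() is the constant 1."""
    return frozenset(idx)

ONE = M()
priv = dict(zip(range(1, 8), (1, 1, 0, 0, 1, 0, 0)))

# Negated clauses, Eqs. (4)-(7)
nc1 = {M(1, 2, 3)}
nc2 = {M(1), M(1, 4), M(1, 5), M(1, 4, 5)}
nc3 = {ONE, M(1), M(6), M(7), M(1, 6), M(1, 7), M(6, 7), M(1, 6, 7)}

# Random functions, Eqs. (8)-(11)
r23 = {M(4), M(5), M(6), M(4, 5), M(4, 6, 7), M(5, 6, 7), M(4, 5, 6, 7)}
r13 = {ONE, M(2, 3), M(6, 7), M(2, 3, 6, 7)}
r12 = {ONE, M(4), M(5), M(2, 3), M(4, 5)}

# Noise part of Eq. (12a); the message bit y is XORed onto the constant term
noise = anf_mul(nc1, r23) ^ anf_mul(nc2, r13) ^ anf_mul(nc3, r12)

def encrypt(y):
    return noise ^ {ONE} if y else set(noise)

for y in (0, 1):
    print(y, anf_eval(encrypt(y), priv))
```

Each negated clause evaluates to 0 on the private key, so the whole noise term vanishes there and only the message bit survives.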

Mallet can attack the scheme above by replacing a literal in $g$ by a truth value, say $x_3 \to 1$. He sends the modified ciphertext to Alice and, from Alice's reply, he will notice whether Alice received the correct bit or not. Hence he concludes that $x_3 = 0$ in the private key. Alice might notice the attack by a suitable syntax check on the decoded text (containing more than one bit), but the security of the key pair is lost anyway.

1. Of course, the ratio of clauses to variables in realistic keys needs to be much larger (see the Appendix) and also the key length needs to be much longer.

Figure 1. Encryption (top): The text is hidden in random noise patterns generated from the public key pub. Subsequent additions hide the structure of the noise patterns to make it harder to subtract the noise.
Decryption (bottom): Via the private key, every noise pattern generated from pub is "switched off" and hence the clear text is revealed from the sum.

This attack can be countered as follows. Bob seeds his 
random number generator, which chooses the clauses and 
generates the random functions Ri, with a hash of the salted 
clear text. He then sends the salt together with the encrypted 
message. If the clear text was long enough, it would take 
Mallet very long to guess the right seed from the salt 
alone (hence decoding the message). But knowing the salt 
and the clear text, Alice can easily re-encrypt the message 
after decryption and check whether the ciphertext has been 
altered by Mallet. She rejects any faulty message, regardless 
of the private key, hence not revealing any information. 


2 Algorithms 

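Generation of key pairs and the encryption algorithm are discussed in this section. The public key, pub, is a "planted" random $k$-SAT instance for $k \in \mathbb{N}$ with $k > 2$. This is a Boolean function in $n \in \mathbb{N}$ variables, $\mathit{pub} : \mathbb{B}^n \to \mathbb{B}$, which is given as a conjunction of $m \in \mathbb{N}$ many $k$-clauses, $c_j : \mathbb{B}^k \subset \mathbb{B}^n \to \mathbb{B}$, together with a truth assignment, the private key $\mathit{priv} \in \mathit{pub}^{-1}(1)$. In summary

$$ \mathit{pub} = \bigwedge_{j=1}^{m} c_j(x) , \quad c_j : x \mapsto \bigvee_{i=1}^{k} \left( x_{I(j,i)} \oplus s(j,i) \right) , \tag{13} $$

where the signs $s(j,i) \in \mathbb{B}$ determine whether the literal $x_{I(j,i)}$ is negated. To discuss the complexity of the algorithms, we use the notation

$$ O(f) := \left\{ g : \mathbb{N} \to \mathbb{R} \;\middle|\; \limsup_{n \to \infty} \frac{|g(n)|}{f(n)} < \infty \right\} \tag{14} $$

$$ o(f) := \left\{ g : \mathbb{N} \to \mathbb{R} \;\middle|\; \limsup_{n \to \infty} \frac{|g(n)|}{f(n)} = 0 \right\} \tag{15} $$

and $\Theta(f) := O(f) \setminus o(f)$ for $f : \mathbb{N} \to \mathbb{R}_+$.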



2.1 Key Generation 

A key pair is generated as follows:

1) Choose the private key $\mathit{priv} \in \mathbb{B}^n$ at random.

2) Generate the clause $c_j$ by randomly choosing $k$ distinct integers $I(j,i) \in \{1, \dots, n\}$ and "signs" $s(j,i) \in \mathbb{B}$.

3) If $c_j(\mathit{priv}) = 1$ accept the clause, otherwise reject it (see footnote 2).

4) Repeat until $m$ clauses have been generated.

In other words, the data to be generated and stored for each key pair is

$$ I : \{1, \dots, m\} \times \{1, \dots, k\} \to \{1, \dots, n\} \tag{16} $$

$$ \text{and} \quad s : \{1, \dots, m\} \times \{1, \dots, k\} \to \mathbb{B} , \tag{17} $$

hence the length of the public key is in $\Theta(mk(\log(n) + 1))$. The run time is proportional to the public key length (see footnote 3).

It is very important to generate hard instances as public keys. Therefore we must ensure that priv is (almost surely) the unique solution, i.e. $m$ needs to be larger than the critical value, $m > m_c = \alpha_c(k)\, n$ for some $\alpha_c(k) \in O(1)$, see |2-. Choosing $m$ rather close to $m_c$ yields the most difficult instance. This is well known in similar setups; see the Appendix for some of our own benchmarks. Choosing $m \propto n$ for fixed $k$ leads to a public key length of $O(n \log(n))$. A different scaling of $m$ with $n$ is not recommended, since the resulting SAT instances then become much easier to solve.
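The key generation steps above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the author's code: variables are indexed from 0, a clause is a list of `(variable, sign)` pairs with sign 1 meaning "negated", and Python's `random` module stands in for a cryptographically secure generator, which a real implementation would need. The function names are hypothetical.

```python
import random

def generate_key_pair(n, m, k=3, rng=None):
    """Planted random k-SAT instance: returns (priv, pub)."""
    rng = rng or random.Random()
    priv = [rng.randrange(2) for _ in range(n)]            # step 1
    pub = []
    while len(pub) < m:                                    # step 4
        variables = rng.sample(range(n), k)                # step 2: k distinct variables
        signs = [rng.randrange(2) for _ in range(k)]
        # step 3: a literal has value x ^ s; keep the clause only if priv satisfies it
        if any(priv[v] ^ s for v, s in zip(variables, signs)):
            pub.append(list(zip(variables, signs)))
    return priv, pub

def satisfies(pub, x):
    """True if the assignment x satisfies every clause of pub."""
    return all(any(x[v] ^ s for v, s in clause) for clause in pub)

priv, pub = generate_key_pair(n=20, m=90, rng=random.Random(1))
print(satisfies(pub, priv))
```

A random clause is rejected with probability $2^{-k}$, so the expected number of trials is $m / (1 - 2^{-k})$, in line with the run time being proportional to the key length.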

2.2 Encryption and Decryption 

We fix parameters $\alpha, \beta \in \mathbb{N}$ with $2 \le \beta \le \alpha$ to tune the run time and security of the encryption.

1) Choose $\alpha$ many tuples of $\beta$ many distinct clauses at random, $J : \{1, \dots, \alpha\} \times \{1, \dots, \beta\} \to \{1, \dots, m\}$ such that $J(i,a) \neq J(i,b)$ for all $a \neq b$.

2) The ciphertext encoding $y \in \mathbb{B}$ is the algebraic normal form (ANF) of

$$ g = y \oplus \bigoplus_{i=1}^{\alpha} \bigoplus_{a=1}^{\beta} \bar{c}_{J(i,a)} \wedge R_{i,a} . \tag{18} $$

Here $R_{i,a}$ is a random function. It depends on the same set of variables as the clauses $\{ c_{J(i,b)} \mid b \neq a \}$ and is generated in ANF where each possible term occurs with probability 1/2.

3) The decryption "algorithm" is $g(\mathit{priv}) = y$.

2. If the algorithm is to have finite worst case run time, one should modify the clause instead of generating a new one at random. Notice, however, that the number of signs to flip has to be chosen carefully if the resulting distribution of public keys is to be unchanged. See e.g. [5].

3. If clauses are rejected this is the case with probability exponentially close to 1.
In choosing the clauses in Step 1, some care is to be taken:

(a) To counter attacks discussed in Section 3.1.2 it is preferable if the clauses within one tuple share variables. This is particularly important if one of the clauses does not contain negations, i.e. $s(i,1) = \dots = s(i,k) = 0$.

(b) We have to ensure that each tuple shares at least one clause with another tuple to counter the attack discussed in Section 3.2.

(c) Each clause of pub is to appear in some tuple as discussed in Section 3.3.

We can ensure (b) and (c) by choosing $J$ as follows: First choose a permutation $\sigma \in S_m$ at random. Then set $J(i,a) = \sigma(i + a - 1)$, where the indices are understood modulo $m$. In this scheme $\alpha = m$. To also ensure (a) one should enhance the probability of neighbouring clauses, $c_{\sigma(i)}$ and $c_{\sigma(i+1)}$, to have variables in common when generating $\sigma$.
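The cyclic choice of $J$ can be written down in a few lines. The sketch below assumes 0-based clause indices and uses a hypothetical helper name, `choose_tuples`; it does not implement the additional bias towards shared variables needed for (a).

```python
import random

def choose_tuples(m, beta, rng=None):
    """Return J as a list of m tuples: J[i][a] = sigma[(i + a) mod m]."""
    rng = rng or random.Random()
    sigma = list(range(m))
    rng.shuffle(sigma)                     # random permutation of the clause indices
    return [[sigma[(i + a) % m] for a in range(beta)] for i in range(m)]

J = choose_tuples(m=10, beta=3, rng=random.Random(0))
for tup in J[:3]:
    print(tup)
```

Consecutive tuples overlap in $\beta - 1$ clauses, which gives property (b), and every clause index occurs in exactly $\beta$ tuples, which gives property (c).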

To turn the negated clauses in (18) into ANF, one can use various identities, such as $x \vee y = x \oplus y \oplus xy$, but $\overline{x \vee y} = \bar{x} \bar{y}$ is the most efficient one here. It leads to

$$ \bar{c} = 1 \oplus (x_1 \oplus s_1) \vee \dots \vee (x_k \oplus s_k) \tag{19a} $$

$$ = (x_1 \oplus s_1 \oplus 1) \wedge \dots \wedge (x_k \oplus s_k \oplus 1) . \tag{19b} $$

The resulting ANF (obtained by distributing $\wedge$ over $\oplus$) is of length $\le 2^k$. The run time to compute the ciphertext in (18) as well as the length of the resulting ciphertext are in $O(\alpha 2^{\beta k})$. Consequently, we need to choose $\beta \in O(\log(m))$ in order for the run time to be polynomial. The length can be expected to be shorter than the run time by a factor $> 2$ due to the cancellation of terms in the sum and due to $R_{i,a}$ only containing half of the possible terms on average. If we choose $\beta = \log_2(m)/k$ and $\alpha \propto m$, we have $\alpha 2^{\beta k} \propto m^2$. For $\beta \in O(1)$, run time and ciphertext length formally only scale with $m$, but for $m \approx 2^{10}$ and $\beta = k = 3$ the pre-factor is still comparable to $m$. In short, encryption with this setup has complexity $O(m^2)$.
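Steps 1 to 3 together with the conversion (19b) can be sketched as follows. This is a minimal illustration under assumptions made here, not the reference implementation: ANFs are sets of monomials (frozensets of 0-based variable indices), a clause is a list of `(variable, sign)` pairs with sign 1 meaning "negated", and `random` is used in place of a secure generator. The toy key of Section 1.1 serves as test data.

```python
import random
from itertools import combinations, product

def anf_mul(f, g):
    """Product of two ANFs over GF(2): x*x = x, equal monomials cancel."""
    out = set()
    for a, b in product(f, g):
        out ^= {a | b}
    return out

def negated_clause_anf(clause):
    """ANF of the negated clause, Eq. (19b): product of (x_v + s + 1)."""
    anf = {frozenset()}
    for v, s in clause:
        factor = {frozenset([v])} if s else {frozenset([v]), frozenset()}
        anf = anf_mul(anf, factor)
    return anf

def random_anf(variables, rng):
    """Random function R: each monomial over `variables` occurs with probability 1/2."""
    return {frozenset(c)
            for r in range(len(variables) + 1)
            for c in combinations(variables, r)
            if rng.random() < 0.5}

def encrypt_bit(pub, y, tuples, rng):
    """Eq. (18): y XOR the sum over all tuples of negated clause times R."""
    g = {frozenset()} if y else set()
    for tup in tuples:
        for a in tup:
            # R depends on the variables of the other clauses in the same tuple
            others = sorted({v for b in tup if b != a for v, _ in pub[b]})
            g ^= anf_mul(negated_clause_anf(pub[a]), random_anf(others, rng))
    return g

def decrypt_bit(g, priv):
    """Step 3: evaluate the ciphertext on the private key."""
    return sum(all(priv[v] for v in mono) for mono in g) % 2

# Toy key of Section 1.1 with 0-based variable indices
priv = [1, 1, 0, 0, 1, 0, 0]
pub = [[(0, 1), (1, 1), (2, 1)], [(0, 1), (3, 0), (4, 0)], [(0, 0), (5, 0), (6, 0)]]
g = encrypt_bit(pub, 1, [[0, 1, 2]], random.Random(0))
print(len(g), decrypt_bit(g, priv))
```

The explicit enumeration in `random_anf` visits all $2^{(\beta-1)k}$ candidate monomials, which is the origin of the exponential dependence on $\beta$ discussed above.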

3 Known Attacks and Improved Schemes 

Obviously, the most pressing question is whether the cipher introduced in Section 2 is really (post-quantum) secure. We will not answer this question, but discuss some attacks and counters in this section. Denote by $G_b = G_b(\mathit{pub})$ the set of all possible ciphertexts encoding $b \in \mathbb{B}$ and by $G = G_1 \cup G_0$ the set of all ciphertexts. Since the problem of deriving the private from the public key is known to be hard, we will focus on attacks on the ciphertext. We will refer to the problem of deciding for $g \in G$ whether $g \in G_0$ or $g \in G_1$ as the decoding problem. We do not have a proof that this is a hard problem, although some related problems are (see Appendix A). In this section we focus on some particular attacks, i.e. algorithms that solve the decoding problem, and show that all of them have exponential run time if the parameters of the encryption algorithm are chosen suitably.



3.1 Simple Attacks 

3.1.1 Enumeration of Ciphertexts Attack

Since the set of all possible ciphertexts for a given public key is finite, a trivial brute force attack is to just enumerate it. In order to prevent this attack, the encryption parameters $\alpha$ and $\beta$ need to be large enough. More precisely, the number of possible ciphertexts is roughly

$$ 2^{\alpha \beta (\beta - 1) k} . \tag{20} $$

We can ensure this number to be (super) exponentially large by choosing

$$ n \in O(\alpha \beta) . \tag{21} $$

Further, we should apply some fixed invertible linear transformation to the message bit vector before encoding. This way, decoding a few of the ciphertext bits does not reveal any clear text bits.



3.1.2 Constant Term Probability Attack

Another rather trivial attack is to determine $g(0, \dots, 0)$ for a ciphertext $g \in G$. This is just the presence or absence of the constant 1 in the ANF, which is where the message bit enters. Hence this property must not distinguish $G_0$ from $G_1$.

The ANF of a random clause of length $k$ contains a constant 1 with probability $1 - 2^{-k}$, hence the probability for each summand in (18) to contain the constant 1 is $2^{-k-1}$. This is rather small, but a variant of the central limit theorem is in this case on our side, see the Appendix. As a consequence, the probability for $g(0, \dots, 0) = 1$ tends to 1/2 exponentially quickly with $\alpha\beta$ at the scale of

$$ \alpha \beta > 2^{k+1} . \tag{22} $$

Considering the enumeration attack from Section 3.1.1, the bound established in (21) is much stronger than (22). In other words, the cipher resists the constant term attack if parameters are chosen such that it resists the enumeration attack.
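The exponential convergence can be made concrete with the standard identity for the parity of independent biased bits: if each of $N$ bits is 1 with probability $p$, their XOR is 1 with probability $(1 - (1 - 2p)^N)/2$. The sketch below applies it with $p = 2^{-k-1}$ and $N = \alpha\beta$, assuming (as a simplification made here, not a claim from the paper's appendix) that the summands contribute their constant terms independently.

```python
def prob_constant_term(num_summands, k):
    """Probability that the noise part of g has constant term 1,
    assuming independent summands, each with constant 1 w.p. 2^(-k-1)."""
    p = 2.0 ** (-k - 1)
    return (1.0 - (1.0 - 2.0 * p) ** num_summands) / 2.0

for n_summands in (1, 16, 64, 256, 3000):
    print(n_summands, prob_constant_term(n_summands, k=3))
```

The deviation from 1/2 is $(1 - 2^{-k})^{\alpha\beta}/2$, which decays like $\exp(-\alpha\beta\, 2^{-k})$ and is therefore negligible once $\alpha\beta$ is large compared to $2^k$.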

A more refined version of this attack focuses on those clauses which can contribute a summand 1, namely those $c_i$ which do not contain any negation. For those,

$$ \bar{c}_i = (x_{i_1} \oplus 1) \cdots (x_{i_k} \oplus 1) = 1 \oplus x_{i_1} \oplus \dots \oplus x_{i_1} x_{i_2} \cdots x_{i_k} \tag{23} $$

contributes a constant to the ciphertext sum iff the corresponding $R_i$ also contains the constant 1. An attacker can try to judge whether the latter is the case by looking for the order $k$ term, $x_{i_1} x_{i_2} \cdots x_{i_k}$, in the ciphertext sum. The same term could be caused by another clause depending on the same variables, but such a clause is part of pub with vanishing probability for large $m \propto n$. Hence we need to ensure that $c_i$ shares at least one variable with another clause in the same tuple or that another clause in the same tuple contains no negated literals. Both of these can lead to the appearance of the same order $k$ term, see the example in Section 1.1.
3.1.3 Ciphertext Value Probability Attack

An attacker can try to estimate $|g^{-1}(1)|$ for any $g \in G$ by evaluating $g$ on a random set of trial inputs. Hence this value must not distinguish $G_1$ from $G_0$. This attack is structurally very similar to the constant term attack (Section 3.1.2). For each clause $c$ of length $k$, the number of inputs on which $c$ is 1 is $|c^{-1}(1)| = 2^k - 1$. The probability of all summands in the ciphertext $g$ in Equation (18) evaluating to 1 on a random input is the same as for none of the summands to contain a constant 1, if $R = 1$ with
probability 1/2. Even if the Rs are chosen such that they 
evaluates to 1 only on cx 2 “^^' many inputs, choosing 


a ^ k 


(24) 


still ensures that the probability for $g(x) = 1$ is exponentially close to 1/2 according to the Appendix. Since $\beta \in O(\log(m))$ for run time reasons, condition (24) is again ensured by (21).


3.2 Decoding Attack 

More sophisticated attacks exploit the structure of the cipher. Any function $g \in G$ is necessarily of the form

$$ g = y \oplus \bigoplus_{i=1}^{m} \bar{c}_i R_i $$

for some functions $R_i$ (see the Appendix). A crucial point of our whole scheme is that polynomial long division does not work over finite fields such as $\mathbb{B}$, because the degree is not well behaved under addition and multiplication in the polynomial ring. This can be seen in the example in Section 1.1. A simpler example is $x^2 = x$ and hence e.g. $x^2 y \oplus y^2 x = 0$. Such "collisions" are sufficiently likely to secure our cipher, if the random functions $R_i$ are chosen in a good way.

Attacks on the structure of the encoding algorithm start with the following observation. The set of all clauses which depend only on literals in a small set $\{x_{i_1}, \dots, x_{i_M}\}$ has an expected size of


m 

m k 


(25) 


which is exponentially small for $M \in o(m)$. In other words, a small set of clauses $C$ is most likely uniquely determined by $D(C) := \bigcup_{c \in C} D(c)$, where $D(c(x_1, \dots, x_k)) = \{x_1, \dots, x_k\}$ denotes the variables on which $c$ depends. This means that most likely all tuples $J(i,1), \dots, J(i,\beta)$ used to generate our ciphertext $g$ in Section 2.2 can be identified from the ANF of $g$. For a given tuple, this leaves $2^{\beta(\beta-1)k}$ choices for the random functions associated with it. If we choose $\beta = \log_2(m)/k$ (see Section 2.2), then $2^{\beta(\beta-1)k} = m^{\beta-1}$. This is super polynomial in $m$, but, for reasonable values of $m$, an attacker can still produce all possible summands associated with this tuple and try to add them to $g$. He can check whether all terms involving variables from this tuple cancel. However, the same variables will also occur in other tuples. In fact, we should ensure that some of the clauses from this tuple also appear in other tuples. This will leave the attacker with a few possibilities for the random function, which can only be decided once the tuples sharing clauses are decided as well. But since all tuples are connected (indirectly) by sharing clauses, the run time of this attack is exponential in $\alpha$, which we chose to be of the order of $m$.


3.3 Reduced SAT Problem Attack 

If an attacker can learn that only a certain subset of clauses 
was used in the cipher, he can try to solve the SAT problem 
given by the conjunction of only those clauses. If the ratio of 
used clauses to variables appearing in those clauses is small 
enough, the SAT problem is easily solvable and any solution 
can be used to decode the ciphertext. 

To resist this attack, care is to be taken in Step 1 of the encryption (Section 2.2). The set of all clauses used, $C = \bigcup_{i=1}^{\alpha} \bigcup_{a=1}^{\beta} \{ c_{J(i,a)} \}$, should not depend on more than $|C|\, n/m$ many different variables. Furthermore, reducing the number of clauses used, even at constant clauses to variables ratio, effectively reduces the key length. Overall, we should ensure that all clauses are used in the cipher.


3.4 Oracle Attack 

The communication according to the algorithms from Section 2 is vulnerable to the following attack: If the recipient is expected to send a reply to an encrypted message, an attacker can fake a ciphertext by replacing a literal with a guessed truth value. If he can learn from the reply whether the ciphertext was correctly decoded, he gets a strong hint, or even evidence, for the assignment of this literal in the private key. Repeating the attack in a suitable scheme will reveal the full private key. This is a severe attack which limits the scope of application of the random cipher to such cases where replies are only sent to authenticated communication partners. Post-quantum authentication could be provided by e.g. a hash-based scheme (see [1, Hash-based Digital Signature Schemes]), or by our identification scheme introduced in Section 5. One could also use only one-time key pairs for encryption, again with the problem of authenticating the new keys.
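The attack can be replayed on the toy ciphertext (12) of Section 1.1. The sketch below is an illustration under assumptions made here: monomials are encoded as digit strings (for example "146" for $x_1 x_4 x_6$, "-" for the constant 1), Mallet knows the bit $y$ that Alice is expected to decode, and `alice_accepts` is a hypothetical stand-in for whatever observable reply leaks that information.

```python
# Noise terms of Eq. (12b)-(12g); "-" denotes the constant 1
TERMS = ("- 4 5 6 7 16 17 23 45 46 47 56 57 67 146 147 156 157 "
         "236 237 456 457 467 567 1237 1456 1457 2367 4567").split()
noise = {frozenset() if t == "-" else frozenset(int(ch) for ch in t) for t in TERMS}
priv = {i + 1: b for i, b in enumerate((1, 1, 0, 0, 1, 0, 0))}

def evaluate(g, x):
    return sum(all(x[i] for i in mono) for mono in g) % 2

def substitute(g, var, value):
    """Replace the variable x_var in the ANF g by a constant truth value."""
    out = set()
    for mono in g:
        if var in mono:
            if value == 0:
                continue           # the monomial vanishes
            mono = mono - {var}    # x_var = 1 drops out of the product
        out ^= {mono}              # equal monomials cancel over GF(2)
    return out

def alice_accepts(g, expected_bit):
    """Oracle: Alice's reply reveals whether she decoded the expected bit."""
    return evaluate(g, priv) == expected_bit

y = 0
g = set(noise)                     # ciphertext of y = 0
learned = {}
for var in range(1, 8):
    for value in (0, 1):
        if not alice_accepts(substitute(g, var, value), y):
            learned[var] = 1 - value   # a wrong decoding proves priv[var] != value
print(learned)
```

A rejected probe is hard evidence for the corresponding key bit, while an accepted probe is only a hint, since the substituted variable may not influence the value of $g$ on the private key. Single-variable probes therefore reveal only part of the key in this toy case; the text above refers to repeating the attack in a suitable scheme to obtain the rest.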

Instead, we have developed two improved versions of our scheme. The one which we consider superior resists the oracle attack completely and is described in Section 3.4.1. In special circumstances, the version described in the Appendix might also be useful. The latter makes the oracle attack substantially more difficult and preserves the probabilistic nature of the cipher.


3.4.1 Proof of Honest Encryption 

The version of the encryption/decryption scheme described here prevents the oracle attack completely without increasing the complexity of the encryption algorithm. Key generation is unchanged, but decryption becomes as complex as encryption and the feature of a stochastic cipher is lost. More precisely, using the cipher discussed in Section 2.2, the ciphertext is not a function of the public key and clear text. In particular, even if Bob encodes the same clear text twice, he will get different ciphertexts. This makes repetition attacks impossible and the algorithm is very resistant against rainbow table type attacks. It is, of course, only pseudo random. In other words, the ciphertext is a function of public key, clear text and the seed of the pseudo random number generator (PRNG), implicitly used to make the random choices. This can be used to verify that the ciphertext was not tampered with in order to oracle the private key, at the cost of losing the rainbow resistance. The latter can then be restored in the usual way by salting.

Concretely, key generation as described in Section 2.1 stays untouched and also encryption is done as explained in Section 2.2, but the sender starts by seeding the PRNG with a specific seed, computed from clear text and a salt. The salt is then to be part of the ciphertext. After decryption of the ciphertext (applying it to the private key), the recipient computes the seed from the clear text and the salt. He then checks that the received ciphertext matches the one that is computed from the public key, the clear text, and the seed according to the fixed encryption algorithm. This way, oracle attacks can be detected without revealing any information.

The cost to pay is that the implementations of the PRNG and the encryption algorithm on the sender and receiver side have to match and that the receiver has to redo the most time consuming part of the whole scheme, the encryption. Furthermore, the clear text needs to be long enough (of order $n$) in order to prevent a brute force attack on the seed. Notice that the seed, as computed from the salt and clear text, is to be considered as a key for decoding the particular message. If an attacker can find the seed he can decode the ciphertext without the private key.
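The re-encryption check can be sketched as follows. This is a sketch under assumptions made here, not the author's implementation: SHA-256 over salt and clear text is used as the seed derivation, Python's `random.Random` plays the role of the agreed PRNG (it is not cryptographically secure), and the cyclic tuple choice of Section 2.2 with $\alpha = m$ is used. The toy key of Section 1.1 serves as test data; a three-bit clear text is of course far too short to protect the seed.

```python
import hashlib
import random
from itertools import combinations, product

def anf_mul(f, g):
    out = set()
    for a, b in product(f, g):
        out ^= {a | b}
    return out

def negated_clause_anf(clause):
    anf = {frozenset()}
    for v, s in clause:
        anf = anf_mul(anf, {frozenset([v])} if s else {frozenset([v]), frozenset()})
    return anf

def encrypt(pub, bits, salt, beta=2):
    """Deterministic encryption: every random choice is drawn from a PRNG
    seeded with a hash of the salt and the clear text."""
    seed = int.from_bytes(hashlib.sha256(salt + bytes(bits)).digest(), "big")
    rng = random.Random(seed)
    m = len(pub)
    ciphertext = []
    for y in bits:
        sigma = list(range(m))
        rng.shuffle(sigma)
        g = {frozenset()} if y else set()
        for i in range(m):
            tup = [sigma[(i + a) % m] for a in range(beta)]
            for a in tup:
                others = sorted({v for b in tup if b != a for v, _ in pub[b]})
                r = {frozenset(c) for d in range(len(others) + 1)
                     for c in combinations(others, d) if rng.random() < 0.5}
                g ^= anf_mul(negated_clause_anf(pub[a]), r)
        ciphertext.append(g)
    return ciphertext

def receive(pub, priv, ciphertext, salt, beta=2):
    """Decrypt, then re-encrypt and compare; reject on any mismatch."""
    bits = [sum(all(priv[v] for v in mono) for mono in g) % 2 for g in ciphertext]
    if encrypt(pub, bits, salt, beta) != ciphertext:
        return None                # same reply for every faulty message
    return bits

priv = [1, 1, 0, 0, 1, 0, 0]
pub = [[(0, 1), (1, 1), (2, 1)], [(0, 1), (3, 0), (4, 0)], [(0, 0), (5, 0), (6, 0)]]
ct = encrypt(pub, [1, 0, 1], b"salt")
print(receive(pub, priv, ct, b"salt"))

tampered = [set(g) for g in ct]
tampered[0] ^= {frozenset([2])}    # Mallet toggles the monomial x_3 (index 2)
print(receive(pub, priv, tampered, b"salt"))
```

The tampered ciphertext still decrypts to the same bits here, yet it is rejected because it differs from the re-computed ciphertext, so the reply does not depend on the private key bit that Mallet is probing.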


4 Homomorphic Encryption 



The cipher described in Section 2.2 is fully homomorphic. That means that if Alice wants to know the value of any function $f : \mathbb{B}^n \to \mathbb{B}^n$ on $x \in \mathbb{B}^n$, she can encrypt $x$ bit-wise into the ciphertext $c = (c_1, \dots, c_n) : \mathbb{B}^n \to \mathbb{B}^n$ as above and then send $c$ to Charley, who has more computation power available. Charley computes $f \circ c$ and sends this processed ciphertext back to Alice. She decrypts

$$ (f \circ c)(\mathit{priv}) = f(c(\mathit{priv})) = f(x) . \tag{26} $$

Notice that oracle attacks by Charley cannot be countered by a proof of honest encryption requirement in this setting (see Section 3.4.1), or else Alice would have to redo the whole computation. Anyway, due to the cipher being malleable, trust in Charley to honestly compute $f$ is needed. One might also use multi-key encryption as described in the Appendix to check that Charley computes honestly and discard the key pair otherwise.

A more severe drawback is that any computation on the ciphertext "bits" $c_i$, while not introducing any noise, does increase the length of the ciphertext. In particular, if $|c_i|$ denotes the number of summands in the ANF of $c_i$, then $|c_i \oplus c_j| \le |c_i| + |c_j|$ and $|c_i \wedge c_j| \le |c_i| \, |c_j|$, where the bounds will likely (almost) saturate for short ciphertexts. As soon as about half of all possible terms are present, i.e. $|c_i| \approx 2^{n-1}$, the length will likely not grow further. But since originally $|c_i| \propto m \propto n$, this means that multiplication leads to an exponential growth in the ciphertext length. Consequently, only few multiplications are feasible in our scheme.
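The homomorphic property and the length bounds can be illustrated on the toy key of Section 1.1. The sketch below makes its own simplifying assumptions: ANFs are sets of monomials with 0-based variable indices, and the two noise polynomials are hand-picked multiples of negated clauses rather than the output of the full encryption algorithm.

```python
from itertools import product

def anf_mul(f, g):
    """AND of two ciphertext bits: ANF product over GF(2)."""
    out = set()
    for a, b in product(f, g):
        out ^= {a | b}
    return out

def anf_eval(f, x):
    return sum(all(x[v] for v in mono) for mono in f) % 2

priv = [1, 1, 0, 0, 1, 0, 0]
nc1 = {frozenset([0, 1, 2])}                          # negated clause 1
nc2 = {frozenset([0]), frozenset([0, 3]),
       frozenset([0, 4]), frozenset([0, 3, 4])}       # negated clause 2

# Any multiple of a negated clause vanishes on the private key
noise_a = anf_mul(nc1, {frozenset(), frozenset([3])})
noise_b = anf_mul(nc2, {frozenset([5]), frozenset([1, 2])})

def toy_encrypt(y, noise):
    return noise ^ ({frozenset()} if y else set())

for a in (0, 1):
    for b in (0, 1):
        ca, cb = toy_encrypt(a, noise_a), toy_encrypt(b, noise_b)
        xor_ct = ca ^ cb                  # XOR is the symmetric difference of term sets
        and_ct = anf_mul(ca, cb)
        print(a, b, anf_eval(xor_ct, priv), anf_eval(and_ct, priv),
              len(ca), len(cb), len(xor_ct), len(and_ct))
```

Addition at most adds the lengths, whereas each multiplication can multiply them, which is the growth that limits the number of feasible multiplications.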

This growth in length can be traded for "noise" by discarding all terms longer (counting the number of literals in a product) than a fixed length $L$. The probability of such terms being satisfied by the (unknown) private key is $2^{-L}$, i.e. if only $\ll 2^{L}$ terms are discarded, the ciphertext probably still decrypts correctly. The final length of the ciphertext bits is then limited to $2^{L}$, but the "noise" generated by multiplications will at some point make it impossible to decrypt the ciphertext, hence effectively also limiting the number of possible multiplications in $f$. In this sense our scheme is only somewhat homomorphic and bootstrapping is needed to make it fully homomorphic without the ciphertexts becoming unfeasibly large.


5 Identification and Signatures 

In this section we introduce a zero-knowledge proof for our key pairs. This leads to an identification and signature scheme, which is independent of the encryption scheme discussed above, apart from sharing the same type of public/private key pairs. In fact, the scheme proposed here is a variant of the one by Blum based on the Hamilton cycle problem [2]. The latter is NP-complete, hence polynomial-time equivalent to SAT, but one can apply the ideas more directly.



5.1 Zero-Knowledge Proof/Identification 

Let pub again be a $k$-SAT instance in $n$ Boolean variables and $\mathit{priv} \in \mathit{pub}^{-1}(1)$ a solution only known to Alice. The zero-knowledge proof consists of $K$ rounds, each of which proceeds as follows:

1) Alice chooses a random invertible function $f : \mathbb{B}^n \to \mathbb{B}^n$ and commits to $f^*(\mathit{pub}) = \mathit{pub} \circ f$ and to $f^{-1}(\mathit{priv})$. More precisely, the commitment to $f^*(\mathit{pub})$ is to be in the form of committing to each literal in the pull back of the form, i.e. to each $f^*(x_{I(1,1)}), f^*(x_{I(1,2)}), \dots, f^*(x_{I(m,k)})$.

2) Bob chooses whether Alice should reveal either

2.a) all of $f^*(\mathit{pub})$ or

2.b) $f^{-1}(\mathit{priv})$ together with $\{ f^*(x_{I(i,a_i)}) \mid i \in \{1, \dots, m\} \}$, where Alice randomly chooses the $a_i \in \{1, \dots, k\}$ such that $f^*(x_{I(i,a_i)})(f^{-1}(\mathit{priv})) = 1$ for all $i$.

In this scheme Bob either verifies that Alice did not cheat when generating $f^*(\mathit{pub})$ or that she indeed knows a solution. Fake proofs will hence be discovered with probability 1/2 each round and the probability to get through all rounds is $2^{-K}$.

The run time of the above scheme depends on what kind of functions we allow for $f$. If we only use permutations (see footnote 4), the run time for generating the permutation, its inverse and reshuffling priv is $O(n)$ and $O(m)$ for re-labelling pub. However, in this case the number of 1s and 0s in the private key is leaked by revealing $f^{-1}(\mathit{priv})$. To avoid this, we should at least add a random affine shift $s \in \mathbb{B}^n$ to the permutation, i.e. $f(x) = \pi(x) + s$.
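One round of the protocol with $f(x) = \pi(x) + s$ can be sketched as follows. Several choices are assumptions made for this illustration and are not taken from the paper: commitments are salted SHA-256 hashes of a `repr` encoding, case 2.a) is implemented by revealing $\pi$, $s$ and the openings of all literal commitments, variables are 0-based, and a clause is a list of `(variable, sign)` pairs with literal value `x ^ sign`. The function names are hypothetical, and the sketch makes no claim about the zero-knowledge quality of this particular commitment.

```python
import hashlib
import os
import random

def commit(value):
    nonce = os.urandom(16)
    return hashlib.sha256(nonce + repr(value).encode()).hexdigest(), nonce

def opens(commitment, nonce, value):
    return hashlib.sha256(nonce + repr(value).encode()).hexdigest() == commitment

def commit_round(pub, priv, rng):
    """Step 1: pick f(x)_i = x[perm[i]] ^ shift[i], commit to f*(pub) and f^-1(priv)."""
    n = len(priv)
    perm = list(range(n))
    rng.shuffle(perm)
    shift = [rng.randrange(2) for _ in range(n)]
    # pull back of the literal (v, s) is the literal (perm[v], s ^ shift[v])
    lits = [[(perm[v], s ^ shift[v]) for v, s in clause] for clause in pub]
    z = [0] * n
    for i in range(n):
        z[perm[i]] = priv[i] ^ shift[i]            # z = f^-1(priv)
    lit_coms = [[commit(l) for l in clause] for clause in lits]
    z_com, z_nonce = commit(z)
    public = ([[c for c, _ in clause] for clause in lit_coms], z_com)
    secret = dict(perm=perm, shift=shift, lits=lits, z=z,
                  lit_coms=lit_coms, z_nonce=z_nonce)
    return public, secret

def respond(secret, challenge, rng):
    if challenge == 0:                             # case 2.a)
        return (secret["perm"], secret["shift"],
                [[nonce for _, nonce in clause] for clause in secret["lit_coms"]])
    z, picks = secret["z"], []                     # case 2.b)
    for lits, coms in zip(secret["lits"], secret["lit_coms"]):
        a = rng.choice([i for i, (v, s) in enumerate(lits) if z[v] ^ s])
        picks.append((a, lits[a], coms[a][1]))
    return z, secret["z_nonce"], picks

def verify(pub, n, public, challenge, response):
    lit_coms, z_com = public
    if challenge == 0:
        perm, shift, nonces = response
        return sorted(perm) == list(range(n)) and all(
            opens(lit_coms[j][i], nonces[j][i], (perm[v], s ^ shift[v]))
            for j, clause in enumerate(pub) for i, (v, s) in enumerate(clause))
    z, z_nonce, picks = response
    return opens(z_com, z_nonce, z) and all(
        opens(lit_coms[j][a], nonce, lit) and z[lit[0]] ^ lit[1] == 1
        for j, (a, lit, nonce) in enumerate(picks))

priv = [1, 1, 0, 0, 1, 0, 0]
pub = [[(0, 1), (1, 1), (2, 1)], [(0, 1), (3, 0), (4, 0)], [(0, 0), (5, 0), (6, 0)]]
rng = random.Random(0)
for challenge in (0, 1):
    public, secret = commit_round(pub, priv, rng)
    print(challenge, verify(pub, len(priv), public, challenge,
                            respond(secret, challenge, rng)))
```

In case 2.b) Bob sees only a permuted and shifted copy of the private key and one satisfied literal per clause, without learning $\pi$ or $s$.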


5.2 Signatures 

By a Fiat-Shamir heuristic we can convert the above interactive identification into a signature scheme. To this end, let $h : \mathbb{B}^* \to \mathbb{B}^K$ be a cryptographic hash function. To sign a document:

1) Alice generates $K$ random functions, $f_1, \dots, f_K$, as in Section 5.1 and commits to all $f_i^{-1}(\mathit{priv})$ and $f_i^*(\mathit{pub})$.

2) She then computes a hash of the document to be signed, concatenated with (a suitable encoding of) her commitments. Denote the latter by $C$.

3) She reveals the information as in Step 2 of Section 5.1, choosing 2.a) or 2.b) in the $i$th round if the $i$th bit of the hash is 0 or 1, respectively. Denote this revealed information by $R$.

4. Here the permutation acts on $\mathbb{B}^n$ by permuting the coefficients with respect to the fixed base and correspondingly on formulas by permuting the indices of the variables.

The signature consists of $C$ and $R$. It is verified by Bob by checking that the correct information was revealed by re-computing the hash. He also checks that the revealed information is valid as in Section 5.1, of course.

To fake this signature, an attacker could fake the zero-knowledge proofs and generate signatures until, by chance, a valid one is generated. However, the probability for this to happen is as small as cheating in the identification scheme, i.e. exponentially small in the number of rounds.
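The only new ingredient compared to the interactive protocol is how the $K$ challenge bits are derived. A minimal sketch, assuming SHA-256 as the hash $h$ (so $K \le 256$) and an unspecified byte encoding of the commitments; the function name `challenge_bits` is hypothetical:

```python
import hashlib

def challenge_bits(document: bytes, commitments: bytes, rounds: int):
    """Fiat-Shamir: bit i of h(document || C) selects 2.a) (0) or 2.b) (1) in round i."""
    digest = hashlib.sha256(document + commitments).digest()
    if rounds > 8 * len(digest):
        raise ValueError("more rounds than hash bits; use a longer hash")
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(rounds)]

bits = challenge_bits(b"document to be signed", b"encoded commitments C", rounds=16)
print(bits)
```

Bob recomputes the same bits from the document and $C$ and checks that $R$ contains exactly the openings these bits ask for. Since the challenges depend on the commitments, a forger has to commit before learning them and can only retry with fresh commitments.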



6 Conclusion 

The motivation for developing the public key crypto-scheme based on the Boolean Satisfiability Problem discussed in this paper is to fill the arsenal of post-quantum cryptography with some fresh ammunition. The simplicity of the scheme developed in this paper might be an advantage over the well known post-quantum schemes. More conceptually, we are not aware of any scheme which uses random ciphertexts in a similar way (compare to Figure 1). Furthermore, there is quite some freedom in the details of the cipher, in particular in choosing probability distributions for the random functions $R_j$. This makes it possible to adapt the algorithm if more sophisticated attacks are discovered in the future. The cipher presented in Section 2.2 has undergone some evolution to resist all attacks which came to our mind. The version using a proof of honest encryption has no vulnerabilities currently known to us, but of course much more cryptanalysis is needed and the reader is invited to devise stronger attacks to challenge and improve the algorithm. In particular, we have not proven that it is (NP-)hard to decipher a message without knowledge of the private key. It is the problem of deducing the private from the public key that is NP-hard, i.e. "post-quantum", by construction. The same is true for the signature scheme presented in Section 5.2, which is independent of our encryption scheme. It is analogous to Blum's well known scheme [2], adapted to our SAT scenario.

A notable feature of our cipher is that, in principle, it is fully homomorphic, i.e. applying any function to the ciphertext bit vector and then decoding yields the same result as decoding and then applying the function. However, as discussed in Section 4, the ciphertext might become unfeasibly long if too many multiplications are applied to it, hence our scheme is effectively only somewhat homomorphic. The oracle attack mentioned in Section 3.4 is a stronger incarnation of malleability. It is a severe generic attack on any cipher consisting of Boolean functions. Enforcing honest encryption, as explained in Section 3.4.1, is a generic counter. The multi-key version of our scheme described in the Appendix is another workaround, however not a completely resistant one. Although it reveals much less information, the key pairs still need to be changed regularly, but here this could be feasible. In some special situations, like using our scheme for homomorphic encryption, the multi-key version might be preferable. Encryption is considerably more complex than key generation and decryption, which in particular yields some protection against DoS attacks for multi-key schemes.

The length of the public key and the run time of the key generation algorithm scale as $O(n \log n)$ with the length of the private key $n$. The length of the ciphertext and the run time of the encryption algorithm per bit of clear text scale as $O(m^{1+\epsilon})$ with $\epsilon > 0$, and we have identified $2\alpha_c(k)\, n > m > \alpha_c(k)\, n$ as the relevant parameter range for hard planted SAT instances (see the Appendix). The crucial question is how big $n$ should be in practice today. At the
SAT Competition 201^ random 3-SAT instances of size 
n « 10^ have been solved for m « rric in less than one hourj^ 
These instances were selected in order to be solvable within 
that time, but still they are randomly generated with non- 
negligible probability. Our own benchmarks (Appendix 
show that the MiniSat solver |l^ can not break keys of length 
n > « 10^ on a modem FC using one thread. Taking 

large scale parallelisation into account, we should at least 
choose n > 2^^, but in view of the SAT Competition results, 
n > 2^^ seems more advisable. 


Key generation is very fast and choosing even n > 10^ 
is not a problem here. However, encryption with our not very much optimised proof of concept implementation [6] already takes time of the order of a few seconds per bit of clear text to encode with $\alpha = m = 5n = 5 \times 2^{10}$ and $\beta = 3$, and still tenths of seconds per bit for $\beta = 2$ and the other parameters as before. Although there is certainly room for improving the implementation, the constraint $\beta \ge 2$ means that the run time of the encryption algorithm is in $O(m^{1+\epsilon})$ with $\epsilon \ge 3/5$ for $m \approx 2^{10}$. These run times are not yet very well suited for practical applications. However, the situation could be improved by advancing away from bit-wise encryption. Encoding longer messages at once is an interesting topic for future research. One could use the (vector space) embedding of polynomials with binary coefficients into polynomials with coefficients in any finite field $\mathbb{F}_q$ to encode the message in a shorter $q$-adic representation digit by digit. However, increasing $q$ makes polynomial long division more likely to be possible, hence the security considerations in Section 3.2 need to be seriously reviewed in this setting. If one can counter the attack from Section 3.3 by other means than using all clauses, smaller $\alpha$ would also lead to a substantial speed up. Anyway, before more research builds on the ideas presented here, we would like our algorithm to be challenged by more advanced attacks in order to improve it. Furthermore, the problems which we have proven to be NP-hard (see Appendix A) are still not very close to the decoding problem. To build faith in this cipher really being post-quantum secure, more research in this direction is in order.


5. http://www.satcompetition.org/2014/ 

6. E.g. http://satcompetition.org/edacc/sc14/experiment/24/result/?id=23531
Appendix A

Some Hard Problems 

In this section, we establish that some problems related to 
the decoding problem (see Section]^ are hard. To this end, 
fix a public key $\mathrm{pub}$ and assume that
$\mathrm{pub}^{-1}(1) = \{\mathrm{priv}\}$. The
partitioning of ciphertexts $G = G_0 \cup G_1$ is characterised by

$$G_1 = \{ g \in G \mid \mathrm{pub} \Rightarrow g \} \tag{27}$$

$$G_0 = \{ g \in G \mid \mathrm{pub} \Rightarrow \bar{g} \} . \tag{28}$$

Deriving the private key is at least as hard as decoding, since
$\forall g \in G : g \in G_{g(\mathrm{priv})}$. Assuming that all
$g \in G$ are given in such a form that evaluating
$g(\mathrm{priv})$ is possible in polynomial
time, the decoding problem is in NP.

If $G$ contained the elementary functions
$X := \{x \mapsto x_i\}$, then the decoding problem would be
(polynomial time) equivalent to deriving the private key, hence
solving the SAT problem. A set of functions $Y$ of polynomial
size will be called “hard to decode” if solving the decoding
problem for each $f \in Y$ determines (by a polynomial time
algorithm) the solution of the SAT problem. The following
sets of functions are hard to decode:

1) The elementary functions $X$

2) Any set of functions containing a subset that is hard to
decode

3) $\{f \oplus s_f \mid f \in Y\}$ where the $s_f \in \mathbb{B}$ are known and $Y$ is hard
to decode

4) $\{f \oplus g \mid f \in Y\}$ where $Y$ is hard to decode and $g$ is any
Boolean function

5) $\{f \wedge g \mid f \in Y\} \cup \{f \wedge \bar{g} \mid f \in Y\}$ with $Y$, $g$ as above

6) $\{x_i \wedge f \mid f \in Y, x_i \in X\}$ with $X$, $Y$ as above

7) $\{x_i \vee f \mid f \in Y, x_i \in X\}$ with $X$, $Y$ as above

3) is hard to decode since
$f \oplus s_f \in G_b \Leftrightarrow f \in G_{b \oplus s_f}$.
4) is hard to decode since we can decide for each $f \oplus g$
whether or not it is implied by $\mathrm{pub}$, then assume that
$g(\mathrm{priv}) = 0$ and use that $Y$ is hard to decode to
produce a trial solution $\mathrm{priv}'$. If
$\mathrm{pub}(\mathrm{priv}') \neq 1$ then $g(\mathrm{priv}) = 1$
and we can use 3).

5) is hard to decode since one of the two sets evaluates to
$\{0\}$ on $\mathrm{priv}$. Since there are only two possibilities
for $g(\mathrm{priv})$, one can check both in polynomial time,
similar to 4). In 6) (and similarly in 7)) we can decide
$\{x_1 \wedge f\}$. If this turns out to be a subset of $G_0$ then
we likely have $x_1(\mathrm{priv}) = 0$ and we proceed with
deciding $\{x_2 \wedge f\}$. After polynomial time we either arrive
at some $x_i(\mathrm{priv}) = 1$ and can hence derive
$\mathrm{priv}$, or conclude that $\mathrm{priv} = (0, \ldots, 0)$.

The above construction shows that some generic problems
similar to the decoding problem are hard to decide.
Notice, however, that the sender cannot (on purpose) encode
a message of polynomial length into a hard-to-decode
set of functions in polynomial time, or else the sender would
solve the SAT problem.
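
To make items 1) and 3) concrete, the following Python sketch simulates a decoding oracle on a toy key and recovers the private key bit by bit. It is only an illustration under stated assumptions: the names `make_oracle`, `elementary`, `recover_from_elementary` and `recover_from_shifted` are hypothetical and do not come from the paper's implementation, and the oracle here simply evaluates $g(\mathrm{priv})$, which a real attacker could of course not do.

```python
import random

def make_oracle(priv):
    """Simulated decoder: returns b with g in G_b, i.e. b = g(priv).
    Hypothetical stand-in; it cheats by knowing priv."""
    return lambda g: g(priv)

def elementary(i):
    """Elementary function x -> x_i."""
    return lambda x: x[i]

def recover_from_elementary(oracle, n):
    """Item 1): decoding every elementary function yields priv directly."""
    return tuple(oracle(elementary(i)) for i in range(n))

def recover_from_shifted(oracle, n, shifts):
    """Item 3): decode x_i XOR s_i for known bits s_i, then undo the shift."""
    bits = []
    for i in range(n):
        s = shifts[i]
        shifted = lambda x, i=i, s=s: x[i] ^ s
        bits.append(oracle(shifted) ^ s)
    return tuple(bits)

if __name__ == "__main__":
    rng = random.Random(0)
    n = 16
    priv = tuple(rng.randint(0, 1) for _ in range(n))
    shifts = [rng.randint(0, 1) for _ in range(n)]
    oracle = make_oracle(priv)
    print(recover_from_elementary(oracle, n) == priv)
    print(recover_from_shifted(oracle, n, shifts) == priv)
```

Both recoveries use exactly $n$ oracle calls, which is the sense in which decoding these sets is polynomial-time equivalent to deriving the private key.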


Appendix B 

Boolean Central Limit Theorem 

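Consider a set of i.i.d. Boolean variables $x_i$, with
$p_1 := \mathrm{prob}(x_i = 1)$. Then

$$p_0^{\oplus} := \mathrm{prob}\Big(\bigoplus_{i=1}^{M} x_i = 0\Big) = \sum_{j=0}^{\lfloor M/2 \rfloor} \binom{M}{2j} p_1^{2j} (1-p_1)^{M-2j} \tag{29}$$

$$p_1^{\oplus} := \mathrm{prob}\Big(\bigoplus_{i=1}^{M} x_i = 1\Big) = \sum_{j=0}^{\lfloor (M-1)/2 \rfloor} \binom{M}{2j+1} p_1^{2j+1} (1-p_1)^{M-2j-1} \tag{30}$$

and hence $|p_0^{\oplus} - p_1^{\oplus}| = |1 - 2p_1|^M$ converges to 0 for any
$0 < p_1 < 1$ and $M \to \infty$ at exponential speed. In other
words, for large enough $M$ we have
$p_0^{\oplus} \approx p_1^{\oplus} \approx 1/2$ irrespective
of $p_1$. Large enough here means $M \gg -1/\log|1 - 2p_1|$,
which means $M \gg p_1^{-1}/2$ for small $p_1 \ll 1/2$.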
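
The identity $|p_0^{\oplus} - p_1^{\oplus}| = |1 - 2p_1|^M$ can be checked numerically. The Python sketch below evaluates the two binomial sums exactly as written in (29) and (30) and compares their difference with the closed form; the function names `parity_probs` and `bias` are hypothetical and chosen only for this illustration.

```python
from math import comb

def parity_probs(p1, M):
    """Probabilities that the XOR of M i.i.d. bits with prob(x_i = 1) = p1
    equals 0 or 1, computed from the sums over even and odd counts."""
    q = 1.0 - p1
    p_even = sum(comb(M, j) * p1**j * q**(M - j) for j in range(0, M + 1, 2))
    p_odd = sum(comb(M, j) * p1**j * q**(M - j) for j in range(1, M + 1, 2))
    return p_even, p_odd

def bias(p1, M):
    """Closed form |1 - 2*p1|**M for the distance between the two."""
    return abs(1.0 - 2.0 * p1) ** M

if __name__ == "__main__":
    for p1 in (0.01, 0.1, 0.3):
        for M in (1, 10, 100):
            p0x, p1x = parity_probs(p1, M)
            print(p1, M, abs(p0x - p1x), bias(p1, M))
```

For a small $p_1$ such as 0.01, the printed bias only becomes negligible once $M$ clearly exceeds $p_1^{-1}/2 = 50$, in line with the estimate above.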

Appendix C 

Structure of Ciphertexts 

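In this section we discuss which functions $g$ can, in
principle, be constructed from the public key $\mathrm{pub}$ without
knowledge of the private key, such that $\mathrm{pub} \Rightarrow g$, i.e.
$g \in G_1$. We will therefore assume that only the clauses $c_i$
of $\mathrm{pub} = \bigwedge_{i=1}^{m} c_i$ can be used as the elementary building
blocks for which $\mathrm{pub} \Rightarrow c_i$ is known.

Any encryption algorithm of the type discussed in this
paper will lead to sets of ciphertexts $G_0$ and $G_1$, which are
contained in the sets $\mathcal{G}_0$ and $\mathcal{G}_1$, respectively. The latter are
constructed as follows:

1) $1 \in \mathcal{G}_1$ and $c_i \in \mathcal{G}_1$ for $i \in \{1, \ldots, m\}$.

2) $f \in \mathcal{G}_a$, $g \in \mathcal{G}_b$ $\Rightarrow$ $f \oplus g \in \mathcal{G}_{a \oplus b}$, $f \wedge g \in \mathcal{G}_{a \wedge b}$ and
$f \vee g \in \mathcal{G}_{a \vee b}$. In particular, $\bar{g} = 1 \oplus g \in \mathcal{G}_{\bar{b}}$.

3) $f \in \mathcal{G}_1$, $g$ arbitrary $\Rightarrow$ $f \vee g \in \mathcal{G}_1$ and $\bar{f} \wedge g \in \mathcal{G}_0$.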
The sets that can be constructed from 1) using 2) and 3) are

Go = |Ac,/, I A (31) 

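$$\mathcal{G}_1 = \{ 1 \oplus f \mid f \in \mathcal{G}_0 \} . \tag{32}$$

These are stable under the operations above since
$f \vee g = 1 \oplus \bar{f}\bar{g} = f \oplus g \oplus fg$
and $\bar{f} = 1 \oplus f$. Further, the set of Boolean functions used in
the construction is complete.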
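
The construction rules can be illustrated on a toy instance. The Python sketch below plants an assignment `priv`, generates random 3-SAT clauses satisfied by it, and builds functions of the form $\bigoplus_j \bar{c}_{i_j} \wedge R_j$ with random $R_j$, which lie in $\mathcal{G}_0$ by rules 3) and 2). This is a sketch under assumptions, not the encryption algorithm of the paper: all names (`planted_3sat`, `random_function`, `build_g0`) and the choice of small random truth tables for $R_j$ are hypothetical.

```python
import random

def eval_clause(clause, x):
    """A clause is a list of (variable index, negated) pairs; returns 0 or 1."""
    return int(any(x[v] ^ neg for v, neg in clause))

def planted_3sat(n, m, rng):
    """Draw m random 3-clauses that are satisfied by a planted assignment."""
    priv = [rng.randint(0, 1) for _ in range(n)]
    clauses = []
    while len(clauses) < m:
        clause = [(v, rng.randint(0, 1)) for v in rng.sample(range(n), 3)]
        if eval_clause(clause, priv):
            clauses.append(clause)
    return priv, clauses

def random_function(n, rng, arity=3):
    """Arbitrary Boolean function given by a random truth table on a few variables."""
    variables = rng.sample(range(n), arity)
    table = [rng.randint(0, 1) for _ in range(2 ** arity)]
    def R(x):
        idx = 0
        for v in variables:
            idx = 2 * idx + x[v]
        return table[idx]
    return R

def build_g0(clauses, n, rng, terms=4):
    """XOR of terms (NOT c_i) AND R_i; every term vanishes on priv."""
    picks = [(rng.choice(clauses), random_function(n, rng)) for _ in range(terms)]
    def g(x):
        value = 0
        for clause, R in picks:
            value ^= (1 - eval_clause(clause, x)) & R(x)
        return value
    return g

if __name__ == "__main__":
    rng = random.Random(1)
    priv, clauses = planted_3sat(12, 50, rng)
    g = build_g0(clauses, 12, rng)
    print(g(priv), 1 ^ g(priv))
```

Since every clause evaluates to 1 on `priv`, each term $\bar{c}_i \wedge R_i$ evaluates to 0 there regardless of $R_i$, so `g(priv)` is 0 and `1 ^ g(priv)` is 1, matching the relation between the two sets in (32).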

Appendix D 
Multi-key Chains 

In some situations, the proof of honest encryption method
laid out in Section 3.4.1 might not be favourable, for example
if the feature of a (pseudo) random cipher is crucial or
re-computing the encryption on the receiver side is too costly.
In such cases, we can still substantially weaken the oracle
attack. The key idea here is to introduce redundancy, which
increases key length and all run times by a constant factor.

The private/public key chain now consists of $\gamma \in \mathbb{N}$
different private/public key pairs. To encrypt a message,
each bit is to be encrypted with each of the public keys, i.e.
the ciphertext for one bit now consists of $\gamma$ Boolean
functions in ANF. In a valid ciphertext, all of these functions
evaluate to the same value upon inserting the respective
private keys.

After decoding, the recipient has to decide whether or
not to accept the message (bit) if the ciphertext is invalid.
Rejecting all invalid ciphertexts would reveal as much
information about the private key as the simple version of
Section]^ Therefore we fix a threshold $t \in \{2, \ldots, \gamma/2\}$
and accept the bit with the value given by the majority if the
minority is smaller than $t$. To properly choose $t$, we consider
the two possible outcomes of the attack:

1) If the message is rejected, the attacker learns that more 
than the (known) threshold t of the bits he guessed did 
not match the private keys. 

2) If the message is accepted, the attacker will learn 
whether or not he guessed the majority of bits right 
(from the reply of the receiver). 
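
The acceptance rule can be sketched in a few lines of Python. The function `decide_bit` implements the majority decision with threshold $t$ as described above. The helper `rejection_rate` is a Monte Carlo estimate under an additional assumption that is not taken from the paper: each manipulated copy is taken to flip the decoded value independently with probability 1/2 (a uniformly random guess). Both names are hypothetical.

```python
import random

def decide_bit(values, t):
    """Majority decision over the gamma decoded copies of one bit.

    values: list of 0/1, the i-th function evaluated on the i-th private key.
    t: threshold, assumed to satisfy t <= gamma / 2.
    Returns the majority bit, or None if the minority is not smaller than t.
    """
    ones = sum(values)
    zeros = len(values) - ones
    if min(ones, zeros) >= t:
        return None
    return 1 if ones > zeros else 0

def rejection_rate(gamma, t, f, trials=10000, seed=0):
    """Estimated rejection probability when f of the gamma copies are manipulated.

    Assumption of this sketch: the honest bit is 0 and every manipulated copy
    decodes to 1 with probability 1/2, independently of the others.
    """
    rng = random.Random(seed)
    rejected = 0
    for _ in range(trials):
        values = [rng.randint(0, 1) if i < f else 0 for i in range(gamma)]
        if decide_bit(values, t) is None:
            rejected += 1
    return rejected / trials

if __name__ == "__main__":
    gamma, t = 101, 20
    for f in (10, 30, 40, 50, 101):
        print(f, rejection_rate(gamma, t, f))
```

Under this assumption, fewer than $t$ manipulated copies can never trigger a rejection, while manipulating nearly all copies is almost always rejected.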

To keep the information leakage about the private key
as small as possible, we should choose $t$ large in order to
make 1) unlikely. More precisely, if $f \in \{0, \ldots, \gamma\}$ of the
encoded versions of the bit are manipulated by replacing
one variable with a guessed value (such that the change in
the function influences its value), the probability of rejection
is



which can be expressed through the error function for large
$f$. If $f$ is close or even equal to $t$, the attacker gains
substantial information, but for this attack, the success
probability is exponentially small in $t$.

In the opposite case of choosing $f$ close to $\gamma$, the attack
is most likely detected if $t$ is small enough. Concretely
choosing e.g.

(34)

with $c = 3$ means that more than 99.7% (for large $\gamma$) of the
attacks will be detected with hardly any information gain for
the attacker. More precisely, the probability of a successful
attack of this type scales down super-exponentially (with the
complementary error function) and is below 10 “® already
for $c = 6$. In practice, choosing $t \propto \gamma$ with a proportionality
factor slightly smaller than 1/2 should be most reasonable.

The attacker gains most information from an attack with
$\mathrm{prob}(t, f) \approx 1/2$, i.e. $f \approx 2t$. This attack being rejected or
not indicates that more or less than half of the private key
bits were guessed correctly. Gathering this knowledge from
0 ( 7 ^) attacks in a suitable scheme will reveal the value of
all $\gamma$ bits of the keys with a high confidence level on the
attacker's side. Hence the multi-key hardened version is not
fully resistant against the oracle attack, but the attack will
almost surely be noticed before substantial information is
leaked. This invalidates the authenticity of the attacker in
an authenticated communication. Even if information was
leaked, not all keys in the chain have to be replaced, which
limits the damage of a successful attack.

One can improve the scheme a little more by also
choosing $\gamma$ at random for each decryption, but statistical
analysis will still eventually reveal the private key bits.
Hence the proof of honest encryption scheme introduced
in Section 3.4.1 is certainly more secure than the multi-key
scheme.


Appendix E
Benchmarks

We have conducted some simple benchmarks for breaking
the public key, i.e. solving the SAT instance, using MiniSat
B-. It is well known that random instances are the hardest
for $m \approx m_c = \alpha_c(k) n$. For the planted instances that we use
as public keys, it turns out that the hardest instances have
slightly more clauses.

E.1 3-SAT

For $k = 3$ we find (see Figure 1) that the hardest
instances have $m \approx 5n$, where $m_c \approx 4.2n$ H. In Figure 2
we show the run time of MiniSat for various random
instances generated by our key generator as a function of
the number of variables for fixed $m/n = 4.3$. This is very
close to the critical ratio. Fitting an exponential function
to the minimal run times, we extrapolate that one should
choose $n_{100} \approx 1500$ to ensure that the MiniSat solver
would take at least 100 years to break the public key.
For $m/n = 4.5$, as displayed in Figure 3, we find the
same $n_{100}$ within the error of our approximation, but for
$m/n = 5$ (Figure 4) $n_{100} < 1000$ is significantly smaller.
Increasing $m/n$ further leads to an increase in $n_{100}$, which
again reaches $n_{100} \approx 1500$ for $m/n = 8$. Consequently,
we choose $m/n = 5$ and a private key length $n = 2^{10}$
for $k = 3$ as the default values for the proof-of-concept
implementation [16]. This leads to a public key length of
about $k \cdot m \cdot (\log_2(n) + 1) \approx 165$ kbit. Notice that the keys
generated with these parameters can be considered
“PC-secure”, but securing the keys against more sophisticated
massively parallel solvers rather requires n > 2 ^^.

E.2 4-SAT

For $k = 4$ the critical ratio of clauses to variables is about
$m_c = 9.8n$. Using $m/n = 10$ our MiniSat benchmarks
indicate $n_{100} \approx 350$. So the private key size can be reduced
significantly by using higher $k$. However, the public key
size for these parameters ($\approx 133$ kbit) is comparable to the
one for $k = 3$. This means that the run time for encryption
with higher $k$ is significantly longer (exponential in $k$, see
Section [Z^ ). Overall, it does not pay off to use higher values
of $k$.
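
The two estimates used in this appendix, the public key length $k \cdot m \cdot (\log_2(n) + 1)$ and the extrapolation of an exponential run-time fit to 100 years, can be reproduced with a short Python sketch. The function names are hypothetical. The fit coefficients $a = -22$, $b = 0.036$ are taken from one of the legend entries of the run-time figures ($2^{-22+0.036n}$ seconds); which $m/n$ ratio that fit belongs to is not assumed here, so the resulting number is only indicative.

```python
from math import log2

SECONDS_PER_YEAR = 365.25 * 24 * 3600

def public_key_bits(k, ratio, n):
    """Size estimate k * m * (log2(n) + 1) in bits, with m = ratio * n clauses."""
    m = ratio * n
    return k * m * (log2(n) + 1)

def n_for_runtime(a, b, years=100.0):
    """Solve 2**(a + b*n) seconds = `years` years for n."""
    return (log2(years * SECONDS_PER_YEAR) - a) / b

if __name__ == "__main__":
    # Default parameters for k = 3: m/n = 5 and n = 2**10.
    print(public_key_bits(3, 5, 2**10) / 1024)  # 165.0 (with 1 kbit = 1024 bit)
    # Extrapolated key length for a 100-year MiniSat run time under the fit above.
    print(n_for_runtime(-22, 0.036))
```

With these inputs the extrapolated key length comes out close to the quoted $n_{100} \approx 1500$, and the key length estimate reproduces the 165 kbit stated above if 1 kbit is read as 1024 bit.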
Figure 2. MiniSat run times for m/n = 4.3.

Figure 3. MiniSat run times for m/n = 4.5.

Figure 4. MiniSat run times for m/n = 5.

Figure 5. MiniSat run times for m/n = 6.

Figure 6. MiniSat run times for m/n = 7.

Figure 7. MiniSat run times for m/n = 8.

Figures 2 to 7 plot CPU time [s] against n. Exponential fits given in the
legends, in order of appearance: $2^{-13+0.033n}$, $2^{-22+0.036n}$;
$2^{-14+0.042n}$, $2^{-18+0.032n}$; $2^{-18+0.061n}$, $2^{-23+0.056n}$;
$2^{-16+0.047n}$, $2^{-21+0.049n}$; $2^{-16+0.04n}$, $2^{-23+0.046n}$;
$2^{-15+0.032n}$, $2^{-17+0.031n}$.